Methods and systems of function-specific tracing

ABSTRACT

A system and methods are provided for function-specific tracing of a program. In one embodiment, a method includes generating a trace profile identifying one or more functions of a target program, wherein the trace profile identifies one or more functions to trace and depth of tracing for each function to be traced, loading the trace profile and the target program, identifying traced functions in the target program based on the trace profile, patching the target program to call a trace parameter for one or more functions, wherein traced functions are declared at runtime, and observing function calls for traced functions of the application. In this regard, individual functions are traced and debugged on a function-by-function basis without modifying the code or pre-arranging functions so they are traceable. As such, the scope of tracing may be dynamically limited to yield only information that is desired.

BACKGROUND

This application is related to co-pending non-provisional U.S. patent application 13/___,___ entitled “Methods and Systems of Distributed Tracing,” filed Jan. 28, 2013, and U.S. patent application 13/___,___ entitled “Methods and Systems of Generating a Billing Feed of a Distributed Network,” filed Jan. 28, 2013.

The present disclosure relates generally to tools for program development, and more particularly to systems and methods for function-specific tracing of programs.

Tracing can be one of the most important tools for program development and debugging. Typically, a debugger allows for execution of an application to be observed, recorded and used to identify particular problems with the application. Drawbacks of typical methods and programs for debugging include the speed of executing the debugging, and barriers to accessing the program. Another drawback is that typical methods and programs for debugging output too much information. By way of example, the typical debugger/tracer traces the path of execution through a program. The problem is that most of any typical program consists of the libraries, interfaces, and runtimes needed to run the program. Thus, tracking down (or at least identifying) errors in other parts of the program may be difficult. Unfortunately, most debuggers present all of the information to a user, including information about parts of a program that a programmer did not write.

A conventional approach to debugging is to place breakpoints in the code around the pieces or portions of code that are of interest, and then step through the code part by part until the desired portion is reached. This approach, however, is time-consuming, and it does not solve the problem, as debugging of the undesired portions of code still occurs.

What is desired is a system and method for providing function-specific tracing that allows for the scope and depth of tracing to be controlled.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1a is a simplified diagram of a system.

FIG. 1b is a schematic view illustrating a simplified view of a cloud computing system.

FIG. 2 is a schematic view illustrating an information processing system as used in various embodiments.

FIG. 3 shows a process for instantiating and launching a tracer according to various embodiments.

FIG. 4 is a method for function-specific tracing according to one or more embodiments.

FIG. 5 is a method directed to dynamic runtime specific support for function-specific tracing.

FIG. 6 is a method directed to bytecode runtime for function-specific tracing.

FIG. 7 illustrates a call flow graph that may be constructed by observing the message flows.

FIG. 8 is a method directed to machine code runtime for function-specific tracing.

FIG. 9 is a graphical representation of patching the memory allocator for a function.

FIG. 10 illustrates a block diagram of a function-specific tracing system.

DETAILED DESCRIPTION

The following disclosure has reference to tracing and debugging programs and applications, and in particular function-specific tracing. In one embodiment, function-specific tracing can allow for one or more individual functions of a program or application to be traced and debugged on a function-by-function basis, without modifying the code or pre-arranging the functions to be traced. According to another embodiment, the scope of function tracing is dynamically limited. As such, debugging and tracing can yield only desired information, in comparison with conventional debugging. According to another embodiment, tracing is performed in a less invasive fashion and over less of the overall codebase such that the speed of the traced program is closer to normal execution speed. In certain embodiments, function-specific tracing is performed in a distributed computing environment.

FIG. 1A illustrates a simplified diagram of a distributed application 100 for which various embodiments of distributed tracing systems and methods may be implemented. It should be appreciated that application 100 is provided merely as an example and that other suitable distributed applications, middleware, or computing systems can benefit from distributed tracing and/or debugging capabilities described herein. According to one embodiment, application 100 is a cloud service.

According to one embodiment, application 100 includes tracing service 105 configured to provide function-specific tracing of one or more programs, applications, systems or distributed applications. As will be described in more detail below, per-function tracing can provide visibility into the performance and into the causes of errors or bugs, and can increase reliability of an application. By way of example, tracing service 105 can observe messages within the distributed application across queues and from particular components of the application. As depicted in FIG. 1A, tracing service 105 interfaces with message service 110 of application 100. Message service 110 connects various subsystems of the application 100, and message service 110 is configured to pass messages relative to one or more elements of system 100.

System 100 may include one or more subsystems, such as controllers 112 and services 117. System 100 may include one or more controllers 112 for the application to be employed in a distributed architecture, such as cloud computing services. As depicted in FIG. 1A, controllers 112 include a compute controller 115 a, a storage controller 115 b, auth controller 115 c, image service controller 115 d and network controller 115 e. Controllers 115 are described with reference to a cloud computing architecture in FIG. 1. By way of example, network controller 115 e deals with host machine network configurations and can perform operations for allocating IP addresses, configuring VLANs, implementing security groups and configuring networks. Each of controllers 112 may interface with one or more services. As depicted in FIG. 1A, compute controller 115 a interfaces with compute pool 120 a, storage controller 115 b may interface with object store 120 b, auth controller 115 c may interface with authentication/authorization controller 120 c, image service controller 115 d may interface with image store 120 d and network controller 115 e may interface with virtual networking devices 120 e. Although controllers 115 and services 120 are described with reference to an open architecture, it should be appreciated that the methods and systems for tracing may be equally applied to other distributed applications.

Referring now to FIG. 1B, an external view of a cloud computing system 130 is illustrated. Cloud computing system 130 includes tracing service 105 and message service 110. According to one embodiment, tracing service 105 can observe messages of cloud computing system 130 and construct a call flow graph within each service and between services of the cloud computing system 130. According to another embodiment, controllers and services of the cloud computing system 130 may include tracing services to transmit message traces in response to sending or receiving of messages.

The cloud computing system 130 includes a user device 132 connected to a network 134 such as, for example, a Transmission Control Protocol/Internet Protocol (TCP/IP) network (e.g., the Internet). The user device 132 is coupled to the cloud computing system 130 via one or more service endpoints 155. Depending on the type of cloud service provided, these endpoints give varying amounts of control relative to the provisioning of resources within the cloud computing system 130. For example, SaaS endpoint 152 a typically only gives information and access relative to the application running on the cloud storage system, and the scaling and processing aspects of the cloud computing system are obscured from the user. PaaS endpoint 152 b typically gives an abstract Application Programming Interface (API) that allows developers to declaratively request or command the backend storage, computation, and scaling resources provided by the cloud, without giving exact control to the user. IaaS endpoint 152 c typically provides the ability to directly request the provisioning of resources, such as computation units (typically virtual machines), software-defined or software-controlled network elements like routers, switches, domain name servers, etc., file or object storage facilities, authorization services, database services, queue services and endpoints, etc. In addition, users interacting with an IaaS cloud are typically able to provide virtual machine images that have been customized for user-specific functions. This allows the cloud computing system 130 to be used for new, user-defined services without requiring specific support.

It is important to recognize that the control allowed via an IaaS endpoint is not complete. Within the cloud computing system 130 are one or more cloud controllers 135 (running what is sometimes called a “cloud operating system”) that work on an even lower level, interacting with physical machines and managing the contradictory demands of the multi-tenant cloud computing system 130. In one embodiment, these correspond to the controllers and services discussed relative to FIG. 1a. The workings of the cloud controllers 135 are typically not exposed outside of the cloud computing system 130, even in an IaaS context. In one embodiment, the commands received through one of the service endpoints 155 are then routed via one or more internal networks 154. The internal network 154 couples the different services to each other. The internal network 154 may encompass various protocols or services, including but not limited to electrical, optical, or wireless connections at the physical layer; Ethernet, Fibre Channel, ATM, and SONET at the MAC layer; TCP, UDP, ZeroMQ or other services at the connection layer; and XMPP, HTTP, AMQP, STOMP, SMS, SMTP, SNMP, or other standards at the protocol layer. The internal network 154 is typically not exposed outside the cloud computing system, except to the extent that one or more virtual networks 156 are exposed that control internal routing according to various rules. The virtual networks 156 typically do not expose as much complexity as may exist in the actual internal network 154, but varying levels of granularity can be exposed to the control of the user, particularly in IaaS services.

In one or more embodiments, it is useful to include various processing or routing nodes in the network layers 154 and 156, such as proxy/gateway 150. Other types of processing or routing nodes may include switches, routers, switch fabrics, caches, format modifiers, or correlators. These processing and routing nodes may or may not be visible to the outside. It is typical that one level of processing or routing nodes is internal only, coupled to the internal network 154, whereas other types of network services may be defined by or accessible to users, and show up in one or more virtual networks 156. Either of the internal network 154 or the virtual networks 156 may be encrypted or authenticated according to the protocols and services described below.

In various embodiments, one or more parts of the cloud computing system 130 are disposed on a single host. Accordingly, some of the “network” layers 154 and 156 may be composed of an internal call graph, inter-process communication (IPC), or a shared memory communication system.

Once a communication passes from the endpoints via a network layer 154 or 156, as well as possibly via one or more switches or processing devices 150, it is received by one or more applicable cloud controllers 135. The cloud controllers 135 are responsible for interpreting the message and coordinating the performance of the necessary corresponding services, returning a response if necessary. Although the cloud controllers 135 may provide services directly, more typically the cloud controllers 135 are in operative contact with the service resources 140 necessary to provide the corresponding services. It is possible for different services to be provided at different levels of abstraction. For example, a service 140 a may be a “compute” service that will work at an IaaS level, allowing the creation and control of user-defined virtual computing resources. In addition to the services discussed relative to FIG. 1a, a cloud computing system 130 may provide a declarative storage API, a SaaS-level Queue service 140 c, a DNS service 140 d, or a Database service 140 e, or other application services without exposing any of the underlying scaling or computational resources. Other services are contemplated as discussed in detail below.

In various embodiments, various cloud computing services or the cloud computing system itself may require a message passing system. The message routing service 110 is available to address this need, but it is not a required part of the system architecture in at least one embodiment. In one embodiment, the message routing service is used to transfer messages from one component to another without explicitly linking the state of the two components. Note that this message routing service 110 may or may not be available for user-addressable systems; in one preferred embodiment, there is a separation between storage for cloud service state and for user data, including user service state.

In various embodiments, various cloud computing services or the cloud computing system itself may require a persistent storage for system state. The data store 125 is available to address this need, but it is not a required part of the system architecture in at least one embodiment. In one embodiment, various aspects of system state are saved in redundant databases on various hosts or as special files in an object storage service. In a second embodiment, a relational database service is used to store system state. In a third embodiment, a column, graph, or document-oriented database is used. Note that this persistent storage may or may not be available for user-addressable systems; in one preferred embodiment, there is a separation between storage for cloud service state and for user data, including user service state.

In various embodiments, it is useful for the cloud computing system 130 to have a system controller 145. In one embodiment, the system controller 145 is similar to the cloud computing controllers 135, except that it is used to control or direct operations at the level of the cloud computing system 130 rather than at the level of an individual service.

For clarity of discussion above, only one user device 132 has been illustrated as connected to the cloud computing system 130, and the discussion generally referred to receiving a communication from outside the cloud computing system, routing it to a cloud controller 135, and coordinating processing of the message via a service 140. The infrastructure described is also equally available for sending out messages. These messages may be sent out as replies to previous communications, or they may be internally sourced. Routing messages from a particular service 140 to a user device 132 is accomplished in the same manner as receiving a message from user device 132 to a service 140, just in reverse. The precise manner of receiving, processing, responding, and sending messages is described below with reference to the various discussed service embodiments. One of skill in the art will recognize, however, that a plurality of user devices 132 may, and typically will, be connected to the cloud computing system 130 and that each element or set of elements within the cloud computing system is replicable as necessary. Further, the cloud computing system 130, whether or not it has one endpoint or multiple endpoints, is expected to encompass embodiments including public clouds, private clouds, hybrid clouds, and multi-vendor clouds.

Each of the user device 132, the cloud computing system 130, the endpoints 152, the cloud controllers 135 and the cloud services 140 typically include a respective information processing system, a subsystem, or a part of a subsystem for executing processes and performing operations (e.g., processing or communicating information). An information processing system is an electronic device capable of processing, executing or otherwise handling information, such as a computer. FIG. 2 shows an information processing system 210 that is representative of one of, or a portion of, the information processing systems described above.

Referring now to FIG. 2, diagram 200 shows an information processing system 210 configured to host one or more virtual machines, coupled to a network 205. The information processing system 210 as shown is representative of one of, or a portion of, the information processing systems described above. The network 205 could be one or both of the networks 154 and 156 described above. Examples of information processing systems include a server computer, a personal computer (e.g., a desktop computer or a portable computer such as, for example, a laptop computer), a handheld computer, and/or a variety of other information handling systems known in the art.

The information processing system 210 may include any or all of the following: (a) a processor 212 for executing and otherwise processing instructions; (b) one or more network interfaces 214 (e.g., circuitry) for communicating between the processor 212 and other devices, those other devices possibly located across the network 205; and (c) a memory device 216 (e.g., FLASH memory, a random access memory (RAM) device, or a read-only memory (ROM) device) for storing information (e.g., instructions executed by processor 212 and data operated upon by processor 212 in response to such instructions). In some embodiments, the information processing system 210 may also include a separate computer-readable medium 218 operably coupled to the processor 212 for storing information and instructions as described further below.

In one embodiment, there is more than one network interface 214, so that the multiple network interfaces can be used to separately route management, production, and other traffic. In one exemplary embodiment, an information processing system has a “management” interface at 1 GB/s, a “production” interface at 10 GB/s, and may have additional interfaces for channel bonding, high availability, or performance. An information processing device configured as a processing or routing node may also have an additional interface dedicated to public Internet traffic, and specific circuitry or resources necessary to act as a VLAN trunk.

In some embodiments, the information processing system 210 may include a plurality of input/output devices 220 a-n which are operably coupled to the processor 212, for inputting or outputting information, such as a display device 220 a, a print device 220 b, or other electronic circuitry 220 c-n for performing other operations of the information processing system 210 known in the art.

With reference to the computer-readable media, including both memory device 216 and secondary computer-readable medium 218, the computer-readable media and the processor 212 are structurally and functionally interrelated with one another as described below in further detail, and the information processing system of the illustrative embodiment is structurally and functionally interrelated with a respective computer-readable medium similar to the manner in which the processor 212 is structurally and functionally interrelated with the computer-readable media 216 and 218. As discussed above, the computer-readable media is implemented using a hard disk drive, a memory device, and/or a variety of other computer-readable media known in the art, and when including functional descriptive material, data structures are created that define structural and functional interrelationships between such data structures and the computer-readable media (and other aspects of the system 200). Such interrelationships permit the data structures' functionality to be realized. For example, in one embodiment the processor 212 reads (e.g., accesses or copies) such functional descriptive material from the network interface 214 or the computer-readable media 218 onto the memory device 216 of the information processing system 210, and the information processing system 210 (more particularly, the processor 212) performs its operations, as described elsewhere herein, in response to such material stored in the memory device of the information processing system 210. In addition to reading such functional descriptive material from the computer-readable medium 218, the processor 212 is capable of reading such functional descriptive material from (or through) the network 205. In one embodiment, the information processing system 210 includes at least one type of computer-readable media that is non-transitory. For explanatory purposes below, singular forms such as “computer-readable medium,” “memory,” and “disk” are used, but it is intended that these may refer to all or any portion of the computer-readable media available in or to a particular information processing system 210, without limiting them to a specific location or implementation.

The information processing system 210 includes a hypervisor 230. The hypervisor 230 may be implemented in software, as a subsidiary information processing system, or in a tailored electrical circuit or as software instructions to be used in conjunction with a processor to create a hardware-software combination that implements the specific functionality described herein. To the extent that software is used to implement the hypervisor, it may include software that is stored on a computer-readable medium, including the computer-readable medium 218. The hypervisor may be included logically “below” a host operating system, as a host itself, as part of a larger host operating system, or as a program or process running “above” or “on top of” a host operating system. Examples of hypervisors include XenServer, KVM, VMware, Microsoft's Hyper-V, and emulation programs such as QEMU.

The hypervisor 230 includes the functionality to add, remove, and modify a number of logical containers 232 a-n associated with the hypervisor. Zero, one, or many of the logical containers 232 a-n contain associated operating environments 234 a-n. The logical containers 232 a-n can implement various interfaces depending upon the desired characteristics of the operating environment. In one embodiment, a logical container 232 implements a hardware-like interface, such that the associated operating environment 234 appears to be running on or within an information processing system such as the information processing system 210. For example, one embodiment of a logical container 232 could implement an interface resembling an x86, x86-64, ARM, or other computer instruction set with appropriate RAM, busses, disks, and network devices. A corresponding operating environment 234 for this embodiment could be an operating system such as Microsoft Windows, Linux, Linux-Android, or Mac OS X. In another embodiment, a logical container 232 implements an operating system-like interface, such that the associated operating environment 234 appears to be running on or within an operating system. For example, one embodiment of this type of logical container 232 could appear to be a Microsoft Windows, Linux, or Mac OS X operating system. Another possible operating system includes an Android operating system, which includes significant runtime functionality on top of a lower-level kernel. A corresponding operating environment 234 could enforce separation between users and processes such that each process or group of processes appeared to have sole access to the resources of the operating system. In a third embodiment, a logical container 232 implements a software-defined interface, such as a language runtime or logical process that the associated operating environment 234 can use to run and interact with its environment. For example, one embodiment of this type of logical container 232 could appear to be a Java, Dalvik, Lua, Python, or other language virtual machine. A corresponding operating environment 234 would use the built-in threading, processing, and code loading capabilities to load and run code. Adding, removing, or modifying a logical container 232 may or may not also involve adding, removing, or modifying an associated operating environment 234. For ease of explanation below, these operating environments will be described in terms of an embodiment as “Virtual Machines,” or “VMs,” but this is simply one implementation among the options listed above.

In one or more embodiments, a VM has one or more virtual network interfaces 236. How the virtual network interface is exposed to the operating environment depends upon the implementation of the operating environment. In an operating environment that mimics a hardware computer, the virtual network interface 236 appears as one or more virtual network interface cards. In an operating environment that appears as an operating system, the virtual network interface 236 appears as a virtual character device or socket. In an operating environment that appears as a language runtime, the virtual network interface appears as a socket, queue, message service, or other appropriate construct. The virtual network interfaces (VNIs) 236 may be associated with a virtual switch (Vswitch) at either the hypervisor or container level. The VNI 236 logically couples the operating environment 234 to the network, and allows the VMs to send and receive network traffic. In one embodiment, the physical network interface card 214 is also coupled to one or more VMs through a Vswitch.

In one or more embodiments, each VM includes identification data for use in naming, interacting with, or referring to the VM. This can include the Media Access Control (MAC) address, the Internet Protocol (IP) address, and one or more unambiguous names or identifiers.

In one or more embodiments, a “volume” is a detachable block storage device. In some embodiments, a particular volume can only be attached to one instance at a time, whereas in other embodiments a volume works like a Storage Area Network (SAN) so that it can be concurrently accessed by multiple devices. Volumes can be attached to either a particular information processing device or a particular virtual machine, so they are or appear to be local to that machine. Further, a volume attached to one information processing device or VM can be exported over the network to share access with other instances using common file sharing protocols. In other embodiments, there are areas of storage declared to be “local storage.” Typically a local storage volume will be storage from the information processing device shared with or exposed to one or more operating environments on the information processing device. Local storage is guaranteed to exist only for the duration of the operating environment; recreating the operating environment may or may not remove or erase any local storage associated with that operating environment.

Having described an example of a distributed application, various embodiments of methods and systems for function-specific tracing will now be described with reference to FIGS. 3-10. Various embodiments of the methods and systems disclosed herein may permit tracing of one or more functions in a program to a desired depth of the program. In addition, a tracer may output one or more of a function call list and a call stack for tracing and debugging the program while yielding only information based on a trace profile. A function list and/or call stack may advantageously show how the program may flow through and be processed by various functions, procedures, methods, or other applicable units of software routines. In various embodiments, such a call stack is constructed at least in part by tracing function calls and returns, processes, software components, virtual machines, physical machines, software services, and network boundaries, from receiving of requests (e.g., an entry of a call to the API) all the way down to where work is performed (e.g., at worker units or other back-end processes) and back, as further described herein.

In this regard, various embodiments of the methods and systems may construct a call flow graph (which may also be referred to herein as a call tree) by observing request and response messages between various components of a program or application, such as a distributed application. A call flow graph is used to capture and represent causal relationships between processing activities of various components. That is, a call flow graph may encode how a processing activity of one or more components may be caused or triggered by a processing activity of one or more other components.
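By way of a minimal sketch only, a call flow graph of this kind might be assembled from observed messages as follows. The message format (a tuple of kind, sender and receiver) and the name `build_call_flow_graph` are assumptions made for illustration and are not taken from the disclosure.

```python
from collections import defaultdict


def build_call_flow_graph(messages):
    """Build a caller -> callees map from observed request/response messages.

    Each message is an assumed (kind, sender, receiver) tuple, where kind is
    "request" or "response". A request adds a causal edge from sender to
    receiver; a response closes the most recent matching open request.
    Returns the graph and any requests that never received a response.
    """
    graph = defaultdict(list)
    pending = []  # stack of open (caller, callee) requests
    for kind, sender, receiver in messages:
        if kind == "request":
            graph[sender].append(receiver)
            pending.append((sender, receiver))
        elif kind == "response":
            # A response travels from the callee back to the caller.
            if pending and pending[-1] == (receiver, sender):
                pending.pop()
    return dict(graph), pending


observed = [
    ("request", "api", "compute"),
    ("request", "compute", "network"),
    ("response", "network", "compute"),
    ("response", "compute", "api"),
]
graph, unanswered = build_call_flow_graph(observed)
```

In this sketch an edge from one component to another records that the first component's activity triggered the second's, and any entries left in `unanswered` indicate requests for which no response was observed.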

Turning now to FIG. 3, a diagram showing one embodiment of the process of instantiating and launching a tracer is shown. In FIG. 3, components may each represent a logical unit of processing. In one embodiment, tracer 320 is configured to interface with compute controller 325, wherein compute controller 325 is configured for running a target program. At step 301, a user calls tracer 320 with at least two arguments: the target program and trace profile 315. According to one embodiment, the target program is an unmodified release version of code, although the program may have one or more supporting files associated with it. At step 302 the user creates a trace profile to include a listing of which functions to trace and to what depth the functions should be followed in the program. Trace profile 315 is captured in a configuration (e.g., config) file or passed as part of calling tracer 320 (e.g., passed on the command line). At step 303, the tracer identifies the traced functions based on the description in the trace profile 315.
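As a rough illustration of calling a tracer with a trace profile and a target program, the sketch below separates the tracer's own arguments from those intended for the target. The positional convention (tracer name, profile path, target, then the target's own arguments) and the name `split_tracer_args` are hypothetical; the disclosure does not prescribe a command-line format.

```python
def split_tracer_args(argv):
    """Split an assumed argv of the form
    [tracer, profile_path, target, *target_args]
    into the profile path and the argument list the target should see.
    """
    if len(argv) < 3:
        raise ValueError("usage: tracer PROFILE TARGET [TARGET_ARGS...]")
    profile_path = argv[1]
    # The target receives an argument list with no reference to the tracer.
    target_argv = list(argv[2:])
    return profile_path, target_argv


profile_path, target_argv = split_tracer_args(
    ["tracer", "profile.json", "app.py", "--port", "8080"]
)
```

Under these assumptions the target program would be started with `target_argv` as its argument list, so it runs as though it had been invoked directly.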

At step 304, tracer 320 patches the target program. In one embodiment, tracer 320 monkey-patches the target program to call into a tracer-provided routine at the entrance and exit of each traced function. Tracer 320 modifies the arguments (e.g., args) list to remove the references to tracer 320 and invokes the main function 330 of the target program. When a to-be-traced function is called by calling function 335 of controller 325 at step 306, function intercepts 340 intercept the function call for traced function 345, and the call goes to a trace helper 350 at step 307. Trace helper 350 observes the state of the program, the arguments that were passed in, etc., and records those. The traced function can have different levels of scrutiny applied; it can run with all interactions observed and recorded, or it can just run and observe entrance and exit values. When the traced function is finished executing, the intercepts 340 fix the call stack as if the trace function had never run.
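The interception described above can be sketched, under stated assumptions, in a dynamic language such as Python. The names `trace_helper`, `patch_function` and `trace_log` are hypothetical, the in-memory list stands in for whatever recording facility an embodiment uses, and only the lighter level of scrutiny (entrance and exit values) is shown.

```python
import functools
import types

trace_log = []  # stand-in for a recording facility (assumption)


def trace_helper(event, name, payload):
    """Record one observation about a traced function."""
    trace_log.append((event, name, payload))


def patch_function(module, func_name):
    """Replace module.func_name with an intercept that reports to the helper
    at the entrance and exit of the original function. Returns the original
    so that the patch can be undone."""
    original = getattr(module, func_name)

    @functools.wraps(original)
    def intercept(*args, **kwargs):
        trace_helper("enter", func_name, (args, kwargs))
        try:
            result = original(*args, **kwargs)
        except Exception as exc:
            trace_helper("error", func_name, repr(exc))
            raise
        trace_helper("exit", func_name, result)
        return result

    setattr(module, func_name, intercept)
    return original


# Demonstration on a throwaway module standing in for the target program.
target = types.ModuleType("target")


def add(a, b):
    return a + b


target.add = add
patch_function(target, "add")
result = target.add(2, 3)
```

Note that patching a module attribute in this way only affects callers that look the function up through the module at call time; code that already holds a direct reference to the original function would bypass the intercept.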

In certain embodiments, recording the state of the program will occur in another process so as to not slow the main program more than necessary. For example, as shown in step 308 of FIG. 3, remote trace facility 355 is configured to record function calls.

At step 309, the target program returns, and the tracer returns at step 310.

Turning now to FIG. 4, a flowchart of a function-specific tracing method 400 is illustrated, in accordance with an embodiment of the disclosure. In one example, all or part of function-specific tracing method 400 is performed to trace and/or debug a program, such as a distributed application as described above with respect to FIGS. 1-2.

Method 400 is initiated at block 405 by generating a trace profile (e.g., trace profile 315) identifying one or more functions of a target program. The trace profile may identify one or more functions to trace and the depth of tracing. In one embodiment, the trace profile is a configuration file. In other embodiments, the trace profile is passed as part of calling the tracer. The trace profile may be created by a user to identify specific functions of a program for tracing.
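By way of a non-limiting illustration, a trace profile of this kind might be captured as a small JSON configuration file and parsed as sketched below. The JSON layout, the field names `functions`, `path` and `depth`, the default depth of zero, and the helper `load_trace_profile` are all assumptions made for this example; the disclosure does not prescribe a file format.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class TracedFunction:
    """One entry of a (hypothetical) trace profile."""
    path: str   # dotted access path, e.g. "module.submodule.function"
    depth: int  # how many call levels below the function to follow


def load_trace_profile(text: str) -> list[TracedFunction]:
    """Parse a JSON trace profile into a list of traced-function entries."""
    raw = json.loads(text)
    entries = []
    for item in raw["functions"]:
        depth = int(item.get("depth", 0))  # assumed default: entrance/exit only
        if depth < 0:
            raise ValueError(f"negative depth for {item['path']}")
        entries.append(TracedFunction(path=item["path"], depth=depth))
    return entries


EXAMPLE_PROFILE = """
{
  "functions": [
    {"path": "json.dumps", "depth": 0},
    {"path": "textwrap.shorten", "depth": 2}
  ]
}
"""

profile = load_trace_profile(EXAMPLE_PROFILE)
```

The same entries could equally be supplied on the command line when calling the tracer; only the parsing step would differ.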

At block 410, the trace profile and the target program are loaded into a controller or processor for debugging or tracing the target program. In one embodiment, the target program is an unmodified version of code for the program.

Traced functions in the target program can be identified at block 415 based on the trace profile. In one embodiment, traced functions are identified based on a description for each function to be traced in the trace profile. In certain embodiments, traced functions are described by the function call and/or one or more metaprogramming abstractions, such as decorators. As will be described herein, the function descriptions are generally based on the language runtime. For example, function tracing may be based on one or more of a dynamic runtime, bytecode runtime, machine code runtime and runtime in general.

According to another embodiment, a traced function is declared via its import or function access path, and identified internally the same way. In certain other embodiments, a traced function is declared using a binary-relative address, and identification of a traced function is based on a binary-relative address.

At block 420, the target program is patched to call a trace parameter for one or more functions, wherein traced functions are declared at runtime. In one embodiment, patching may include patching the target program to call into a tracer routine at the entrance and exit of each function. According to another embodiment, patching may include wrapping a traced function with a decorator. A decorator can be a function that expects another function as a parameter. By wrapping a function with a decorator, each time the original function is called, the decorated function will be called instead. As such, when calling a function returned by the decorator, the wrapper is called and arguments for the program are passed to the wrapper and in turn may be passed to the decorated function. Decorators may be employed by one or more embodiments, including implementations written in languages supporting first-class functions, such as Python, Ruby, Clojure, or Scheme, for example. One advantage of wrapping functions with a decorator is that functions may be traced without requiring the code of the program to be modified. By way of example, not modifying the program code may mean that the program itself does not have to be modified, but the way in which the program is executed at runtime is modified. In that fashion, development and testing of a program is more efficient without requiring program or testing breaks to be inserted into code during development. In addition, decorators may be employed to extend the behavior of a function from an external library or to debug the function.
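By way of a non-limiting illustration, wrapping a function with a decorator that records entrance and exit values might be sketched in Python as follows. The names `traced`, `TRACE_LOG` and `area` are hypothetical and chosen only for this example.

```python
import functools

TRACE_LOG = []  # hypothetical in-memory sink for trace records


def traced(func):
    """Wrap func so that each call records its entrance and exit."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        TRACE_LOG.append(("enter", func.__name__, args, kwargs))
        try:
            result = func(*args, **kwargs)
        except Exception as exc:
            TRACE_LOG.append(("raise", func.__name__, repr(exc)))
            raise
        TRACE_LOG.append(("exit", func.__name__, result))
        return result
    return wrapper


def area(width, height):
    """Stand-in for a function of the target program."""
    return width * height


# Wrapping happens at runtime; the source of `area` is untouched.
area = traced(area)
value = area(3, height=4)
```

Because the wrapper is applied by rebinding the name at runtime, the body of `area` is unchanged, which mirrors the point above that only the way the program executes is modified.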

At block 425, function calls for traced functions of the application are observed. In one embodiment, observing a traced function includes tracing bytecode before execution, wherein function calls are inserted into the target program based on bytecode manipulation. Observing may include identifying traced functions by one or more of a trace profile by way of output files for debugging, tracing a symbolic call stack, and launching a rebase to call into a trace library. Observing may include recording arguments passed by traced functions based on the depth of tracing for each traced function. When patching at block 420 includes wrapping each function with a decorator, observing includes observing a decorated function. Observing may also include recording entrance and exit values of observed functions.
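By way of a non-limiting illustration, observing calls down to a configured depth might be sketched in Python using the interpreter's profiling hook. The helper `trace_to_depth` and the convention that depth 0 means the traced function alone are assumptions of this sketch; `sys.setprofile` is a standard-library facility and is used here only as one possible mechanism.

```python
import sys


def trace_to_depth(func, max_depth):
    """Return a wrapper that records Python calls made by func, max_depth levels down."""
    def wrapper(*args, **kwargs):
        records = []
        depth = 0

        def profiler(frame, event, arg):
            nonlocal depth
            if event == "call":
                depth += 1
                if depth <= max_depth + 1:
                    # Level 0 is the traced function itself.
                    records.append((depth - 1, frame.f_code.co_name))
            elif event == "return":
                depth -= 1
            # c_call / c_return events (built-in functions) are ignored here.

        previous = sys.getprofile()
        sys.setprofile(profiler)
        try:
            result = func(*args, **kwargs)
        finally:
            sys.setprofile(previous)  # restore whatever hook was installed before
        return result, records
    return wrapper


def leaf(x):
    return x + 1


def middle(x):
    return leaf(x) * 2


def top(x):
    return middle(x) + leaf(x)


result, records = trace_to_depth(top, max_depth=1)(1)
```

With `max_depth=1`, the call to `leaf` made from inside `middle` is two levels down and is not recorded, while the direct call from `top` to `leaf` is.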

Based on method 400 of FIG. 4, one or more functions of a program are traced and/or debugged at runtime. Method 400 may also include outputting one or more of a list of the function calls for each traced function and call stack of traced function calls.

According to another embodiment, function-specific tracing is implemented across different types of runtimes, with different integration techniques. By way of example, function-specific tracing may tie into the function call architecture of a language. As such, certain details are expected to be different between runtimes. According to one embodiment, function-specific tracing may apply to one or more of dynamic runtimes, static bytecode runtimes, and machine code runtimes. It should be appreciated that implementation of identifying target programs, patching, call interception and stack fixing will require runtime-specific support.

Referring now to FIG. 5, a flowchart of a function-specific tracing method 500 is illustrated, in accordance with an embodiment of the disclosure. All or part of function-specific tracing method 500 may be performed to trace and/or debug a program, such as a distributed application as described above with respect to FIGS. 1-2.

Method 500 is directed to dynamic runtime specific support. As such, the functions for tracing may be observed directly in the source code. Although not shown in FIG. 5, method 500 may include generating a trace profile and loading a trace profile and target program into a controller or processor, as described above with reference to blocks 405 and 410. In another embodiment, a parser or import statement is modified to modify the code to inject the appropriate tracer hooks.

Method 500 may include declaring trace functions via their path (e.g., module.submodule.function) at block 505. The target functions may be wrapped with a decorator at block 510, either when a target function is imported or when it is first encountered at runtime.
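By way of a non-limiting illustration, declaring a traced function by its dotted path and wrapping it at runtime might be sketched in Python as follows. The helpers `resolve` and `patch_by_path` and the list `CALLS` are hypothetical, and `json.dumps` is used merely as a convenient stand-in for a function of the target program.

```python
import functools
import importlib
import json

CALLS = []  # hypothetical record of intercepted calls


def resolve(path):
    """Split 'pkg.mod.func' into (owning module, attribute name, function)."""
    module_path, _, name = path.rpartition(".")
    owner = importlib.import_module(module_path)
    return owner, name, getattr(owner, name)


def patch_by_path(path):
    """Replace the function at `path` with a recording wrapper; return an undo callable."""
    owner, name, original = resolve(path)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        CALLS.append((path, args, kwargs))
        return original(*args, **kwargs)

    setattr(owner, name, wrapper)
    return lambda: setattr(owner, name, original)


undo = patch_by_path("json.dumps")
short = json.dumps([1, 2])  # goes through the wrapper
undo()                      # restore the unpatched function
```

One limitation of this simple approach is that a caller which bound the function earlier (for example via `from json import dumps`) keeps its reference to the original and is not intercepted; wrapping at import time, as mentioned above, is one way to avoid that.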

At block 515, the decorated function is called instead of the traced function. The program call stack may be observed at block 520. Because the call stack is mutable from the machine code and consists of a stack of frame objects, the frames that include the tracer functions can be removed.
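By way of a non-limiting illustration, the effect of removing tracer frames can be approximated in Python by filtering them out of a captured view of the call stack. This sketch does not mutate the live interpreter stack; it only omits tracer-owned frames from what is reported. The names `TRACER_FUNCS`, `visible_stack` and `_trace_wrapper` are hypothetical.

```python
import functools
import traceback

TRACER_FUNCS = {"_trace_wrapper"}  # hypothetical names of tracer-owned frames


def visible_stack():
    """Return function names on the current call stack, tracer frames omitted."""
    frames = traceback.extract_stack()[:-1]  # drop visible_stack's own frame
    return [f.name for f in frames if f.name not in TRACER_FUNCS]


def traced(func):
    @functools.wraps(func)
    def _trace_wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return _trace_wrapper


@traced
def inner():
    # Return both the filtered view and the raw view for comparison.
    raw = [f.name for f in traceback.extract_stack()]
    return visible_stack(), raw


def outer():
    return inner()


stack, raw_stack = outer()
```

Here the raw view shows the wrapper frame sitting between `outer` and `inner`, while the filtered view reads as if `outer` had called `inner` directly.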

Referring now to FIG. 6, a flowchart of a function-specific tracing method 600 is illustrated, in accordance with an embodiment of the disclosure. All or part of function-specific tracing method 600 may be performed to trace and/or debug a program, such as a distributed application as described above with respect to FIGS. 1-2.

Method 600 is directed to bytecode runtime of the function-specific tracing. As such, the functions for tracing are inspected directly from source code. Although not shown in FIG. 6, method 600 may include generating a trace profile and loading a trace profile and target program into a controller or processor, as described above with reference to blocks 405 and 410.

Method 600 may include declaring trace functions via their path (e.g., module.submodule.function) at block 605. The bytecode may contain metadata to identify functions to be traced. At block 610, the bytecode is traced before execution and the calls are inserted using a bytecode manipulation utility, using a function similar to that used for aspect-oriented programming dependency injection.

At block 615, the bytecode is sent to the injected function, not the original function. The program call stack is observed at block 620. FIG. 7 depicts a graphical representation of a call stack, in Java for example.

Referring now to FIG. 7, a graphical presentation is depicted for a call stack 1000 for a Java™ program. According to one embodiment, the call stack of the running Java program is modeled by three interfaces: Frame 715 encapsulates the data stored in a single stack frame, such as the operand stack and local variables; Frame Source 710 encapsulates the allocation and layout of Frames, controlling such things as the argument-passing mechanism; and Context 705 encapsulates the storage and management of the call stack as well as the locking logic required by synchronized methods. The call stack can be modified to “remove” the calls from the stack by adjusting the frame source attribute and the capacity attribute. None of the stack addresses will need to be manipulated, as the information will still be there on the stack until it is garbage-collected, but it will simply be “skipped over.”

Referring now to FIG. 8, a flowchart of a function-specific tracing method 800 is illustrated, in accordance with an embodiment of the disclosure. All or part of function-specific tracing method 800 is performed to trace and/or debug a program, such as a distributed application as described above with respect to FIGS. 1-2.

Method 800 is directed to machine code runtimes. With machine code runtimes, most of the metadata associated with original code has been stripped away in the compilation process. As such, the addition of some loadable metadata helpers can expose the same sort of information as other types of architectures described herein. Although not shown in FIG. 8, method 800 may include generating a trace profile and loading a trace profile and target program into a controller or processor, as described above with reference to blocks 405 and 410.

In one embodiment, method 800 may include declaring trace function locations using a binary-relative address at block 805. In another embodiment, a second executable produced by the same program code is included alongside the target executable, wherein the second executable includes debugging metadata. The second executable is implemented in one or more formats such as DWARF, stabs or .DBG. The functions identified via the trace profile may then be located in the debugging output files. In yet another embodiment, bytecode runtime is implemented if the code is available to trace the symbolic call stack for the code, such as C code. The calls may form a state machine that can be tracked through transitions. When the trace function is reached (based on the state machine location), that call is redirected to the trace helper library (e.g., trace helper 350).

At block 815, the method includes monkey patching the machine code at runtime by changing the target in a jump instruction. FIG. 9 depicts a graphical representation 900 of monkey patching the memory allocator for a function such as the monkey patching at block 815 of method 800. At block 820, the program call stack is observed. Based on the implementation of the machine code, modification of the program arguments may occur automatically (via the JMP target) or based on redirection when the call occurs. As such, call stack fixing may not be required as stack frames may not exist in a C program in the same sense as a higher-level runtime. To the extent call stack fixing is required, fixing is similar to Java fixing as described above with reference to FIG. 7.

Referring now to FIG. 10, a block diagram is illustrated of a function-specific tracing system, in accordance with an embodiment of the disclosure. The function-specific tracing system is configured to construct a call flow graph or a distributed call stack. The function-specific tracing system may comprise, in one embodiment, a function-specific tracing service 1002, trace profile module 1004, tracer module 1016, trace facility 1018, and trace helper module 1010. In one embodiment, the function-specific tracing system is implemented on top of or as part of, for example, the distributed application of FIG. 1. It will be appreciated that the distributed tracing system is in no way limited to or requires a distributed application, and that the distributed tracing system may be implemented on top of or as part of any other suitable distributed application, middleware, or computing system to beneficially provide thereto the distributed tracing and/or debugging capabilities described herein.

In one embodiment, function-specific service 1002 is configured to subscribe or otherwise attach to one or more message queues 1014 to observe messages communicated among components 1012A-1012D through message queues 1014. For example, function-specific service 1002 is configured to observe messages by performing the subscription-based observation techniques and operations described above in connection with block 1302 of FIG. 13.

In one embodiment, function-specific service 1002 is configured to receive, from instrumentation points, message traces describing messages being communicated among components 1012A-1012D. In this regard, function-specific service 1002 is configured to merge message traces from different instrumentation points 1016. Further in this regard, function-specific service 1002 is configured to merge message traces received from instrumentation points 1016 with message traces obtained by observing message queues 1014. For example, function-specific service 1002 is configured to implement the merging and message representation techniques and operations.

In one embodiment, instrumentation points 1016 are located at various tap points described above with respect to block 1302 of FIG. 13, including an RPC runtime 1018, an ORB 1020, an HTTP or remote database gateway 1022, and a network protocol stack 1024. In one embodiment, instrumentation points 1016 are configured to generate and transmit message traces to function-specific service 1002, the message traces describing request/response messages that pass through the corresponding tap points.

Per-process tracer 1026 may be configured to trace a call stack (e.g., an execution stack, a runtime stack) of a process of component 1012A-1012D by running the process under its environment, in a manner similar to call stack tracing in conventional single process tracers or debuggers. In one embodiment, per-process tracer 1026 is further configured to transmit a description of the traced call stack to the function-specific service in a manner described with respect to the distributed call stack generation. In one embodiment, function-specific system 1000 may comprise as many per-process tracers 1026 as the number of processes that may run in the underlying distributed application. In other embodiments, there may be per-process tracers for some but not all of the processes that may run in the underlying distributed application.

In one embodiment, function-specific service 1002 may comprise appropriate data structures and related functions for encoding, constructing, and/or storing an observed sequence of messages 1004, per-process call stacks 1006, probabilistic models 1008, and call flow graphs

1-27. (canceled)
 1. A method of function-specific tracing in a distributed application, the method comprising: generating a trace profile by observing the processing of a set of one or more inputs to a target program, the trace profile corresponding to the control flow between a plurality of functions invoked in the processing of said inputs; evaluating the applicability of one or more user-specified pruning criteria, the criteria selected from a group including a number of invocations, the total time spent in a particular function, the number of times a function is called, and a depth of the function call stack; pruning the trace profile according to the applicable evaluated pruning criteria; modifying the target program of the distributed application to provide tracing information from the functions identified in the trace profile; loading, into a compute controller in a distributed computing system, the modified target program; executing, by the compute controller, the modified target program; emitting metadata from the modified target program according to the trace profile; and collecting emitted metadata by a monitoring program.
 2. The method of claim 1, wherein the monitoring program runs in a different process within the same execution environment.
 3. The method of claim 1, wherein the monitoring program runs in a different execution environment accessible over a network.
 4. The method of claim 1, further comprising: generating a second trace profile by observing the processing of a second set of one or more inputs to a second target program, the second trace profile corresponding to the control flow between a plurality of functions invoked in the processing of the second set of inputs, the second set of inputs being associated with the first set of inputs; evaluating the applicability of the user-specified pruning criteria, pruning the second trace profile according to the applicable evaluated pruning criteria; modifying the second target program of the distributed application to provide tracing information from the functions identified in the second trace profile; loading, into a compute controller in a distributed computing system, the second modified target program; executing, by the compute controller, the second modified target program; emitting metadata from the second modified target program according to the trace profile; collecting second emitted metadata by a monitoring program; and associating the emitted data and the second emitted data.
 5. The method of claim 4, wherein the modified target program and the second modified target program are executed by the same compute controller.
 6. The method of claim 4, wherein the modified target program and the second modified target program are executed by different compute controllers within the same distributed system.
 7. The method of claim 4, wherein the monitoring program runs in an execution environment accessible to the modified target programs over a network.
 8. The method of claim 1, wherein elements of the emitted metadata are grouped and/or correlated.
 9. The method of claim 4, wherein elements of the emitted metadata and the second emitted metadata are grouped and/or correlated.
 10. A system comprising: a trace profiler to generate a trace profile identifying one or more functions of a target program, wherein the trace profile identifies one or more functions to trace by observing the execution of the target program in response to a first set of one or more inputs, and correlates an invoked function address with a function name, and outputs a list of functions corresponding to the control structures traversed within the target program while evaluating the first set of one or more inputs; a pruner to modify the trace profile by removing or correlating one or more function invocations from the trace profile according to one or more applicable user-specified pruning criteria, the criteria selected from a group including a number of invocations, the total time spent in a particular function, the number of times a function is called, and a depth of the function call stack; a compiler operable to take an input representing the target program and the trace profile and output an instrumented target program; a module to load, into a controller module in a first node of distributed computing system, the instrumented target program; a controller operable to execute the instrumented target program responsive to the reception of a received set of inputs and output program execution metadata received during the operation of the program; and a monitor for receiving the program execution metadata.
 11. The system of claim 10, further comprising: a second generated trace profile generated by the trace profiler's observation of the execution of a second target program in response to a second set of one or more inputs; a second instrumented target program output by the compiler given an input representing the second target program and the second trace profile; a module to load, into a second controller module in a second node of distributed computing system, the instrumented target program; a second controller operable to execute the instrumented target program responsive to the reception of a second set of received inputs and outputs second program execution metadata.
 12. The system of claim 11, wherein the first node is the second node.
 13. The system of claim 11, wherein the received set of inputs and the second set of received inputs are associated with a distributed calling tree representing the execution of a distributed application spanning at least two nodes.
 14. The system of claim 10, wherein the monitor is co-located on the first node.
 15. The system of claim 11, wherein the monitor is not co-located on the first or second node.
 16. The system of claim 11, wherein the monitor correlates the program execution metadata with the second program execution metadata.
 17. A system for profiling distributed applications, the system comprising: first, second, and third computing nodes, the first, second, and third nodes communicably coupled via a network; a first instrumented program executed in a context on the first node, and a second instrumented program executed in a context on the second node, wherein the instrumented programs have been modified to record metadata associated with one or more monitored function invocations, the monitored functions corresponding to a set of functions identified in a first trace profile for the first instrumented program and a second trace profile for the second instrumented program, the trace profiles associating internal function identifiers with function names; a monitoring program executed in a context on the third node; wherein the monitoring program associates function call metadata reported from the first and second instrumented program with the multi-node calling tree associated with the execution of a distributed application comprising the first and second instrumented programs.
 18. The system of claim 17, further comprising a network monitor, communicably coupled to the third node, operable to record network traffic between the first and second nodes.
 19. The system of claim 18, wherein the monitoring program further associates information from the network monitor with the execution of the distributed application.
 20. The system of claim 17, wherein elements of the function call metadata reported from the first and the second instrumented programs are grouped and/or correlated.